Do Text-to-Speech Synthesisers Pronounce Correctly? A Preliminary Study

Authors

  • David Gareth Evans
  • E. A. Draffan
  • Abi James
  • Paul Blenkhorn
Abstract

This paper evaluates four commercial text-to-speech synthesisers used by dyslexic people to listen to and proofread text. Two evaluators listened to 704 common English words and judged whether each word was pronounced correctly. Counting a word as incorrect only where both evaluators agree that it was mispronounced, the proportion of correct pronunciations for the four synthesisers ranges from 98.9% to 99.6% of the 704 words. The evaluators also listened to the same synthesisers speaking phrases containing 44 pairs of homographs and judged whether each instance of a homograph was spoken correctly. The level of correctness for the four synthesisers ranged from 76.3% to 91.3%.
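
The word-level scoring rule in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `agreed_correct_rate` and the toy judgements are hypothetical, and the count of 8 agreed errors is not reported in the abstract; it is chosen here because 696 of 704 rounds to the lower bound of 98.9%.

```python
from typing import Sequence


def agreed_correct_rate(eval_a: Sequence[bool], eval_b: Sequence[bool]) -> float:
    """Return the percentage of words counted as correctly pronounced.

    Each sequence holds one judgement per word (True = judged correct).
    A word counts as incorrect only when BOTH evaluators judged it incorrect,
    which is one reading of the agreement rule described in the abstract.
    """
    if len(eval_a) != len(eval_b):
        raise ValueError("both evaluators must judge the same number of words")
    if not eval_a:
        raise ValueError("at least one word is required")
    agreed_incorrect = sum(1 for a, b in zip(eval_a, eval_b) if not a and not b)
    return 100.0 * (len(eval_a) - agreed_incorrect) / len(eval_a)


# Hypothetical judgements for one synthesiser over 704 words (not real data):
# evaluator A flags the first 10 words, evaluator B flags the first 8,
# so the two evaluators agree on 8 incorrect words.
TOTAL_WORDS = 704
eval_a = [i >= 10 for i in range(TOTAL_WORDS)]
eval_b = [i >= 8 for i in range(TOTAL_WORDS)]

rate = agreed_correct_rate(eval_a, eval_b)
print(f"{rate:.1f}% of {TOTAL_WORDS} words counted as correct")
```

Under this reading, words flagged by only one evaluator are still counted as correct, so the reported percentages are best understood as an upper estimate of pronunciation accuracy.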

Similar resources

Speech Synthesis of Code-Mixed Text

Most Text to Speech (TTS) systems today assume that the input text is in a single language and is written in the same language that the text needs to be synthesized in. However, in bilingual and multilingual communities, code mixing or code switching occurs in speech, in which speakers switch between languages in the same utterance. Due to the popularity of social media, we now see code-mixing ...

Automatic phonetisation for Icelandic

As a part of my final thesis in language technology, I created a speech synthesiser using the free MBROLA system. MBROLA is a project designed to make speech synthesisers for as many languages as possible available for free. It does not require a lot of technological prowess for the general user to create such a synthesiser: all that is required is segmented speech data, and the rest is handled...

Comparing text-driven and speech-driven visual speech synthesisers

We present a comparison of a text-driven and a speech-driven visual speech synthesiser. Both are trained using the same data and both use the same Active Appearance Model (AAM) to encode and re-synthesise visual speech. Objective quality, measured using correlation, suggests the performance of both approaches is close, but subjective opinion ranks the text-driven approach significantly higher.

On evaluating synthesised visual speech

This paper describes issues relating to the subjective evaluation of synthesised visual speech. Two approaches to synthesis are compared: a text-driven synthesiser and a speech-driven synthesiser. Both synthesisers are trained using the same data and both use the same model for rendering the synthesised visual speech. Naturalness is used as a performance metric, and the naturalness of real visu...

Evolution of Text-to-Speech Systems and Methods of Their Assessment

The paper gives a retrospective of the development of speech synthesis systems, from mechanical synthesisers to computer systems for text-to-speech conversion (TTS) and analyses the perspectives of biomechanical and multimodal TTS systems within dialogue systems addressing higher cognitive levels as well. Special attention is given to the methods for assessment of the quality of synthesised spe...

Publication date: 2006